www.sjmmf.org



Journal of Modern Mathematics Frontier Volume 2 Issue 2, June 2013 



The Exponentiated Complementary 
Exponential Geometric Distribution 

ECEG Distribution 

Cintia Y. Yamachi 1, Mari Roman 2, Francisco Louzada 3, Maria A. P. Franco 4, Vicente G. Cancho 5

1,2,4 Departamento de Estatistica, Universidade Federal de Sao Carlos/DEs; 3,5 Instituto de Ciencias Matematicas e Computacao, Universidade de Sao Paulo/ICMC
Sao Carlos, SP - Brasil

1 cintiayurie@yahoo.com.br; 2 mari.romanl9@gmail.com; 3 louzada@icmc.usp.br; 4 mapfranco@ufscar.br; 5 garibay@icmc.usp.br



Abstract 

Recently, new classes of models have appeared for modeling
survival times, such as the Exponentiated Exponential
distribution (Gupta & Kundu, 1999), an extension of the Lindley
distribution (Bakouch, Al-Zahrani, Al-Shomrani, Marchi,
and Louzada, 2011), and generalizations of the Frechet, gamma and
Gumbel distributions (Nadarajah & Kotz, 2006), which can accommodate
increasing, decreasing and unimodal hazard functions. In this
paper we discuss a generalization of the distribution
presented by Louzada, Roman and Cancho (2011), which
arises in latent complementary risks scenarios, where only the maximum
lifetime among all risks is observed, and which can accommodate
increasing hazard functions. The density, survival function
and failure rate are presented. Statistical properties such as the
characteristic function, moments, the r-th order moment, the
quantile function and the residual lifetime distribution are provided.
The maximum likelihood estimator and inference for the
distribution are discussed. The performance on a real dataset
is assessed by fitting different distributions to the survival
times.

Introduction

The Complementary Exponential Geometric (CEG)
distribution was introduced by Louzada, Roman and
Cancho (2011). This distribution arises in latent
complementary risks scenarios: situations with
several risk factors in which the cause of failure is
unknown and only the maximum lifetime value among all
risks is observed, characterizing a complementary
risks (CR) problem (Basu and Klein, 1982). In
classical CR scenarios the lifetime associated with a
particular risk is not observable; rather, we observe
only the maximum lifetime value among all risks.



Put simply, in reliability we observe only the
maximum component lifetime of a parallel system.
That is, the observable quantities for each component
are the maximum lifetime value to failure among all
risks, and the cause of failure. The dual of the CR problem is the
so-called competing risks scenario, where the lifetime
associated with a particular risk is not observable;
rather, we observe only the minimum lifetime value
among all risks. Full statistical procedures and an
extensive literature are available to deal with these
problems, and interested readers can refer to Lawless
(2003), Crowder, Kimber, Smith and Sweeting (1991)
and Cox & Oakes (1984).

Regarding compositions of distributions, there are
several works. For instance, Adamidis & Loukas
(1998) introduced the Exponential Geometric distribution;
Cancho, Louzada and Barriga (2011) introduced the
Poisson Exponential distribution; Silva, Ortega and
Cordeiro (2010) studied in detail the beta modified
Weibull distribution; Silva, Barreto-Souza and
Cordeiro (2010) defined the Generalized Exponential
Geometric distribution; and Barreto-Souza, Morais
and Cordeiro (2010) proposed the Weibull Geometric
distribution.

The Exponential distribution is a
widely used lifetime distribution for modeling many
problems in lifetime testing and reliability studies. In
recent years, several new classes of models grounded in
its simple, elegant and closed form were introduced,
such as those of Gupta & Kundu (1999) and Barreto-Souza &
Cribari-Neto (2009), as well as the precursor of the CEG,
the Exponential Geometric distribution presented by
Adamidis & Loukas (1998).

The exponentiation of distributions is a mechanism
that makes a model more flexible. Nadarajah & Kotz
(2006) introduced four exponentiated-type
distributions: the Exponentiated Gamma,
Exponentiated Weibull, Exponentiated Gumbel and
Exponentiated Frechet distributions. Several other
authors have also presented exponentiated distributions,
such as Mudholkar, Srivastava and Freimer (1995),
with the exponentiation of the Weibull distribution;
Gupta & Kundu (1999), with the Exponentiated
Exponential; Barriga, Louzada and Cancho (2011), with
the Complementary Exponential Power (CEP) distribution,
which is the exponentiation of the Exponential Power
distribution proposed by Smith & Bain (1975); and
Bakouch, Al-Zahrani, Al-Shomrani, Marchi and
Louzada (2011), with the extension of the Lindley (EL)
distribution.

In this paper, we propose a new lifetime distribution
family, which is a generalization of the CEG
distribution, namely the Exponentiated
Complementary Exponential Geometric (ECEG)
distribution. This distribution is obtained by
exponentiating the cumulative distribution function of the CEG.
The CEG distribution has two parameters, $\lambda$ and $\theta$, of scale and shape
respectively; its density function and cumulative
distribution function are given by

$$f(y) = \frac{\lambda\theta e^{-\lambda y}}{\left(e^{-\lambda y}(1-\theta)+\theta\right)^{2}} \tag{1}$$

and

$$F(y) = \frac{\theta\left(1-e^{-\lambda y}\right)}{e^{-\lambda y}(1-\theta)+\theta}, \tag{2}$$

where $y>0$ is the time, $0<\theta<1$ and $\lambda>0$.



The CEG distribution accommodates increasing failure
rates, while the new ECEG family accommodates
decreasing failure rates as well as increasing ones. The
paper is organized as follows. Section 2 presents the
density, survival, cumulative distribution and hazard functions of
this family. Properties such as the characteristic function,
moments, the r-th order moment, the quantile function and the
residual lifetime distribution are presented in Section 3.
Sections 4 and 5 present, respectively, the
inference method and applications to artificial and real
datasets. Section 6 provides some concluding remarks.

The ECEG Distribution

Marshall & Olkin (2007) presented the idea of
exponentiation, based on a baseline cumulative
distribution function $F_{baseline}(y)$ and an arbitrary
power $\alpha>0$, obtaining a new cumulative distribution
function $F(y) = [F_{baseline}(y)]^{\alpha}$, where $\alpha$ can be referred to
as a resilience parameter and $F(y)$ is a resilience
parameter family. The Exponentiated Complementary
Exponential Geometric (ECEG) distribution is obtained by
exponentiating the cumulative distribution function (2) of the
CEG distribution proposed by Louzada, Roman,
and Cancho (2011), and it is given by

$$F(y) = \left[\frac{\theta\left(1-e^{-\lambda y}\right)}{e^{-\lambda y}(1-\theta)+\theta}\right]^{\alpha}, \tag{3}$$

where $y>0$, $\alpha>0$, $0<\theta<1$ and $\lambda>0$.



The density function of a random variable with ECEG
distribution is given by:

$$f(y) = \frac{\alpha\lambda\theta^{\alpha} e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}}{\left(e^{-\lambda y}(1-\theta)+\theta\right)^{\alpha+1}}. \tag{4}$$

In (4), $\alpha$ and $\theta$ are shape parameters and $\lambda$ is the scale
parameter. For $\alpha=1$ the ECEG reduces to the CEG
distribution of Louzada, Roman, and Cancho (2011). For
$\alpha=\theta=1$ the ECEG reduces to the Exponential (E)
distribution.

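The survival function of an ECEG distributed random
variable is given by

$$S(y) = 1-\left[\frac{\theta\left(1-e^{-\lambda y}\right)}{e^{-\lambda y}(1-\theta)+\theta}\right]^{\alpha}. \tag{5}$$

From (4) and (5), the hazard function, according to
the relationship $h(y)=f(y)/S(y)$, is given by

$$h(y) = \frac{\alpha\lambda e^{-\lambda y}\left[\theta\left(1-e^{-\lambda y}\right)\right]^{\alpha}}{\left(e^{-\lambda y}(1-\theta)+\theta\right)\left(1-e^{-\lambda y}\right)\left\{\left(e^{-\lambda y}(1-\theta)+\theta\right)^{\alpha}-\left[\theta\left(1-e^{-\lambda y}\right)\right]^{\alpha}\right\}}. \tag{6}$$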



The hazard function can be increasing or decreasing, as shown in
Figure 1.

The ECEG distribution accommodates unimodal hazard functions and
a broad variety of monotone hazard functions, depending
on the region of the shape parameter space in which the
parameter values lie.

Figure 1 - Hazard function of the ECEG distribution. Left panel: for $\alpha = 0.90$ and
$\theta = 0.50$. Right panel: for $\alpha = 1.60$ and $\theta = 0.95$.

(i) If $\alpha > 1$, $\alpha\theta > 1$ or $\alpha\theta < 1$, and $\lambda\theta < 1$, we have h(y)
increasing.

(ii) If $\alpha < 1$, $\alpha\theta < 1$, and $\lambda\theta < 1$, we have h(y) decreasing.
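The distribution, density, survival and hazard functions can be evaluated numerically from the definition $F(y) = [\theta(1-e^{-\lambda y})/(e^{-\lambda y}(1-\theta)+\theta)]^{\alpha}$. The following is a minimal Python sketch, not the authors' code: the function names are illustrative assumptions, and the parameter values in the usage lines (including $\lambda = 1$) are chosen only for illustration.

```python
import math


def eceg_cdf(y, alpha, lam, theta):
    """ECEG CDF: [theta*(1 - exp(-lam*y)) / (exp(-lam*y)*(1 - theta) + theta)]**alpha."""
    u = math.exp(-lam * y)
    return (theta * (1.0 - u) / (u * (1.0 - theta) + theta)) ** alpha


def eceg_pdf(y, alpha, lam, theta):
    """ECEG density, the derivative of eceg_cdf with respect to y (requires y > 0)."""
    u = math.exp(-lam * y)
    denom = u * (1.0 - theta) + theta
    return (alpha * lam * theta ** alpha * u * (1.0 - u) ** (alpha - 1.0)
            / denom ** (alpha + 1.0))


def eceg_survival(y, alpha, lam, theta):
    """Survival function S(y) = 1 - F(y)."""
    return 1.0 - eceg_cdf(y, alpha, lam, theta)


def eceg_hazard(y, alpha, lam, theta):
    """Hazard function h(y) = f(y) / S(y)."""
    return eceg_pdf(y, alpha, lam, theta) / eceg_survival(y, alpha, lam, theta)


if __name__ == "__main__":
    # Illustrative (assumed) parameter values: alpha = 1.6, lam = 1.0, theta = 0.95.
    for y in (0.5, 1.0, 2.0, 4.0):
        print(y, round(eceg_hazard(y, 1.6, 1.0, 0.95), 4))
```

Setting `alpha = theta = 1` gives a constant hazard equal to `lam`, which is a quick check that the Exponential special case is recovered.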

Since the hazard rate function is complex, the shape of
the hazard function was obtained through a combination of
conditions, checked against the hypotheses of
Glaser's theorem (Glaser, 1980), which addresses this
subject through the following considerations. Define $\eta(y)=-f'(y)/f(y)$,
where $f'(y)$ is the first derivative of the density (4).
Hence,

$$\eta(y) = \lambda-(\alpha-1)\frac{\lambda e^{-\lambda y}}{1-e^{-\lambda y}}-(\alpha+1)\frac{\lambda(1-\theta)e^{-\lambda y}}{e^{-\lambda y}(1-\theta)+\theta}$$

and

$$\eta'(y) = (\alpha-1)\frac{\lambda^{2}e^{-\lambda y}}{\left(1-e^{-\lambda y}\right)^{2}}+(\alpha+1)\frac{\lambda^{2}\theta(1-\theta)e^{-\lambda y}}{\left(e^{-\lambda y}(1-\theta)+\theta\right)^{2}}.$$

From this result we obtain the conditions listed above:
if $\alpha>1$, $\alpha\theta>1$ or $\alpha\theta<1$, and $\lambda\theta<1$, then $\eta'(y)>0$ for every $y>0$,
and consequently the hazard function h(y) is
increasing; if $\alpha<1$, $\alpha\theta<1$, and $\lambda\theta<1$, then $\eta'(y)<0$ for every $y>0$,
and therefore the failure rate function is decreasing.

Some Properties 

Many of the most important features and
characteristics of a distribution can be studied through
its moments, such as the mean and variance. A general
expression for the r-th ordinary moment $\mu'_r = E(Y^r)$
of the ECEG is hard to obtain, so
we restrict attention to the mean and variance, as follows.
The moment-generating function of the variable Y, with density
function given by (4), can be obtained analytically.

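For any real number t, let $\Phi_Y(t)$ be the characteristic
function of Y, that is, $\Phi_Y(t)=E(e^{itY})$, where i denotes the
imaginary unit. With the preceding notation, we state
the following.

Proposition 1: For the random variable Y with ECEG
distribution, the characteristic function is
given by

$$\Phi_Y(t) = \alpha\theta^{\alpha}\,\Gamma\!\left(1-\frac{it}{\lambda}\right)\sum_{k=0}^{\infty}\binom{-(\alpha+1)}{k}(\theta-1)^{k}\frac{\Gamma(\alpha+k)}{\Gamma\!\left(\alpha+k+1-it/\lambda\right)}. \tag{7}$$

Proof 1: From the definition of the characteristic function and (4),

$$\Phi_Y(t) = \int_0^{\infty}e^{ity}f(y)\,dy = \alpha\theta^{\alpha}\int_0^{1}z^{\alpha-1}(1-z)^{-it/\lambda}\left[1+(\theta-1)z\right]^{-(\alpha+1)}dz,$$

where the last equality follows from the change of
variable $z = 1-e^{-\lambda y}$.

Comparing the last integral with (17), we obtain $n=-it/\lambda+1$,
$b=(\theta-1)$, $m=1$, $p=\alpha$, $l=-(\alpha+1)$, and making the
appropriate substitutions completes the proof.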

Proposition 2: If Y has ECEG distribution, then its mean and
variance are expressed in terms of $_2F_1(\cdot)$, the hypergeometric function,
given by

$$_2F_1(a,b;c;z) = \sum_{k=0}^{\infty}\frac{(a)_k(b)_k}{(c)_k}\frac{z^{k}}{k!}.$$

Proof 2: Setting t=0 in the derivatives of (7) and considering that
$E(Y^{r}) = \Phi_Y^{(r)}(0)/i^{r}$, we obtain the first and second moments. From
the first and second moments the variance can be easily
found.
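Because a general closed form for the r-th moment is hard to obtain, a numerical cross-check can be useful. The sketch below relies on the standard identity $E(Y^r) = r\int_0^\infty y^{r-1}S(y)\,dy$ for non-negative random variables, with $S(y) = 1-[\theta(1-e^{-\lambda y})/(e^{-\lambda y}(1-\theta)+\theta)]^{\alpha}$. It is an illustration under stated assumptions, not the paper's method: the function names, the Simpson rule and the truncation point of the integral are all choices made here.

```python
import math


def eceg_survival(y, alpha, lam, theta):
    """ECEG survival function S(y) = 1 - F(y)."""
    u = math.exp(-lam * y)
    return 1.0 - (theta * (1.0 - u) / (u * (1.0 - theta) + theta)) ** alpha


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0


def eceg_moment(r, alpha, lam, theta, upper=None):
    """Approximate E(Y^r) = r * integral_0^upper y^(r-1) S(y) dy for integer r >= 1."""
    if upper is None:
        # Assumed cut-off: S(y) decays like exp(-lam*y), so the tail is tiny here.
        upper = 60.0 / lam
    integrand = lambda y: y ** (r - 1) * eceg_survival(y, alpha, lam, theta)
    return r * simpson(integrand, 0.0, upper)


def eceg_mean_variance(alpha, lam, theta):
    """Mean and variance from the first two numerical moments."""
    m1 = eceg_moment(1, alpha, lam, theta)
    m2 = eceg_moment(2, alpha, lam, theta)
    return m1, m2 - m1 ** 2
```

For `alpha = theta = 1` the result should agree with the Exponential mean `1/lam` and variance `1/lam**2`, which gives a simple sanity check of the integration settings.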

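The p-th quantile of the Exponentiated Complementary
Exponential Geometric distribution, the solution of $F(y_p)=p$, is
given by

$$Q(p) = F^{-1}(p) = -\lambda^{-1}\ln\left[\frac{1-p^{1/\alpha}}{p^{1/\alpha}\left(\theta^{-1}-1\right)+1}\right], \tag{10}$$

where p has the Uniform(0,1) distribution.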
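Solving $F(y)=p$ for y with $F(y) = [\theta(1-e^{-\lambda y})/(e^{-\lambda y}(1-\theta)+\theta)]^{\alpha}$ gives $y = -\lambda^{-1}\ln\{\theta(1-q)/(q(1-\theta)+\theta)\}$ with $q = p^{1/\alpha}$, which allows sampling by inverse transformation. The sketch below is a hedged illustration: the function names are assumptions, censoring is not simulated, and the parameter values in the usage note are only an example.

```python
import math
import random


def eceg_quantile(p, alpha, lam, theta):
    """Quantile of the ECEG distribution for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # 1 - p**(1/alpha), computed with expm1 to stay accurate when p is close to 1.
    one_minus_q = -math.expm1(math.log(p) / alpha)
    q = 1.0 - one_minus_q
    return -math.log(theta * one_minus_q / (q * (1.0 - theta) + theta)) / lam


def eceg_sample(n, alpha, lam, theta, rng):
    """Draw n uncensored ECEG lifetimes by inverse transformation."""
    draws = []
    while len(draws) < n:
        p = rng.random()
        if p > 0.0:  # random() can return exactly 0.0, which is outside (0, 1)
            draws.append(eceg_quantile(p, alpha, lam, theta))
    return draws
```

For example, `eceg_sample(20, 4.2323, 0.07015, 0.50, random.Random(1))` draws 20 uncensored lifetimes; the seed is arbitrary.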

Let $Y_1, Y_2, \ldots, Y_n$ be a random sample taken from the ECEG
distribution, and denote by $Y_{1:n}, \ldots, Y_{n:n}$ the order statistics.
The density function of the i-th order statistic is

$$f_{i:n}(y) = \frac{1}{B(i,\,n-i+1)}\,f(y)\,[F(y)]^{i-1}\,[1-F(y)]^{n-i},$$

where $B(\cdot,\cdot)$ is the Beta function and f(y) and F(y) are given by (4) and (3).

The r-th moment of the i-th order statistic $Y_{i:n}$ can be
obtained from the known result,

p ~ i V n 



(11)

Proposition 3: For the random variable Y with ECEG
distribution, the r-th moment of the i-th
order statistic is given by

Proof 3: From (11) and (18), the result follows.

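The residual lifetime distribution of a random variable
Y distributed as ECEG has the survival function:

$$S(y \mid t) = \frac{S(t+y)}{S(t)} = \frac{1-\left[\dfrac{\theta\left(1-e^{-\lambda(t+y)}\right)}{e^{-\lambda(t+y)}(1-\theta)+\theta}\right]^{\alpha}}{1-\left[\dfrac{\theta\left(1-e^{-\lambda t}\right)}{e^{-\lambda t}(1-\theta)+\theta}\right]^{\alpha}}.$$

The mean residual lifetime of a continuous
distribution with survival function S(t) is given by:

$$\mu(t) = E(Y-t \mid Y>t) = \frac{1}{S(t)}\int_t^{\infty}S(u)\,du. \tag{14}$$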
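The mean residual lifetime $\mu(t) = \frac{1}{S(t)}\int_t^\infty S(u)\,du$ can be approximated by numerical integration of the ECEG survival function $S(y) = 1-[\theta(1-e^{-\lambda y})/(e^{-\lambda y}(1-\theta)+\theta)]^{\alpha}$. The sketch below is an illustration only: the function names, the Simpson rule and the truncation width of the infinite integral are assumptions made here, not part of the paper.

```python
import math


def eceg_survival(y, alpha, lam, theta):
    """ECEG survival function S(y) = 1 - F(y)."""
    u = math.exp(-lam * y)
    return 1.0 - (theta * (1.0 - u) / (u * (1.0 - theta) + theta)) ** alpha


def mean_residual_life(t, alpha, lam, theta, width=None, n=20000):
    """Approximate mu(t) = (1 / S(t)) * integral_t^inf S(u) du.

    The infinite upper limit is replaced by t + width and the integral is
    evaluated with the composite Simpson rule on n (even) subintervals.
    """
    if width is None:
        width = 60.0 / lam  # assumed truncation; S decays like exp(-lam * u)
    h = width / n
    total = eceg_survival(t, alpha, lam, theta) + eceg_survival(t + width, alpha, lam, theta)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight * eceg_survival(t + k * h, alpha, lam, theta)
    integral = total * h / 3.0
    return integral / eceg_survival(t, alpha, lam, theta)
```

In the Exponential special case (`alpha = theta = 1`) the mean residual lifetime is constant and equal to `1/lam`, which provides a simple check of the truncation width.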



Proposition 4: For the random variable Y with ECEG
distribution, the mean residual lifetime
is given by

Proof 4: From (14) and using S(y) given by (5) we have
that

Estimation and Inference



0"'(l + m-a + l) 



First, in order to identify the shape of the failure rate
function of lifetime data, we consider a graphical
method based on the TTT plot. In its empirical version
the TTT plot is given by

$$G(r/n) = \frac{\sum_{i=1}^{r}Y_{i:n}+(n-r)\,Y_{r:n}}{\sum_{i=1}^{n}Y_{i:n}},$$

where $r=1,\ldots,n$ and $Y_{i:n}$ represent the order statistics of
the sample. It has been shown that the failure rate
function is increasing (decreasing) if the TTT plot is
concave (convex).
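The empirical TTT transform, $G(r/n) = [\sum_{i=1}^{r}Y_{i:n}+(n-r)Y_{r:n}]/\sum_{i=1}^{n}Y_{i:n}$, is straightforward to compute. The following is a small sketch with an illustrative function name; it returns the points to be plotted rather than drawing the figure, and it ignores censoring.

```python
def ttt_plot_points(sample):
    """Return the points (r/n, G(r/n)) of the empirical TTT plot.

    G(r/n) = (sum of the r smallest values + (n - r) * r-th smallest value)
             / (sum of all values).
    """
    ys = sorted(sample)
    n = len(ys)
    total = sum(ys)
    points = []
    running = 0.0
    for r, y in enumerate(ys, start=1):
        running += y  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * y) / total))
    return points


if __name__ == "__main__":
    # Hypothetical lifetimes, used only to show the output format.
    for x, g in ttt_plot_points([3.0, 1.0, 2.0]):
        print(round(x, 3), round(g, 3))
```

Plotting the returned points against the diagonal of the unit square shows whether the curve is concave or convex.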

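Assuming the lifetimes are independent of the
censoring mechanism, the logarithm of the likelihood
function is given by:

$$\ell(\alpha,\lambda,\theta) = \sum_{i=1}^{n}\delta_i\log\left[\frac{\alpha\lambda\theta^{\alpha}e^{-\lambda y_i}\left(1-e^{-\lambda y_i}\right)^{\alpha-1}}{\left(e^{-\lambda y_i}(1-\theta)+\theta\right)^{\alpha+1}}\right]+\sum_{i=1}^{n}(1-\delta_i)\log\left\{1-\left[\frac{\theta\left(1-e^{-\lambda y_i}\right)}{e^{-\lambda y_i}(1-\theta)+\theta}\right]^{\alpha}\right\}.$$

The maximum likelihood estimates (MLEs) of the
parameters are obtained by direct maximization of the
logarithm of the likelihood, where $\delta_i$ is an indicator
variable, equal to 1 when failure/death occurs and
0 when the observation is censored.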
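The censored log-likelihood sums $\log f(y_i)$ over observed failures and $\log S(y_i)$ over censored observations, with f and S the ECEG density and survival function. The sketch below writes this objective in plain Python; the function names and the data in the test are hypothetical, and the optimizer is deliberately left out (the negative log-likelihood can be handed to any general-purpose minimizer, as is done with `optim` in R).

```python
import math


def eceg_loglik(params, times, events):
    """Censored ECEG log-likelihood.

    params = (alpha, lam, theta); events[i] is 1 for an observed failure
    and 0 for a censored lifetime.
    """
    alpha, lam, theta = params
    loglik = 0.0
    for y, d in zip(times, events):
        u = math.exp(-lam * y)
        denom = u * (1.0 - theta) + theta
        if d == 1:
            # log of the density f(y)
            loglik += (math.log(alpha) + math.log(lam) + alpha * math.log(theta)
                       - lam * y + (alpha - 1.0) * math.log(1.0 - u)
                       - (alpha + 1.0) * math.log(denom))
        else:
            # log of the survival function S(y)
            loglik += math.log(1.0 - (theta * (1.0 - u) / denom) ** alpha)
    return loglik


def eceg_negloglik(params, times, events):
    """Negative log-likelihood, the quantity to be minimized."""
    return -eceg_loglik(params, times, events)
```

A practical implementation would also need to keep the optimizer inside the parameter space (for example by optimizing over log-transformed parameters); that step is not shown here.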

We minimize $-l(\cdot)$ to find the MLEs of the
parameters using the optim function in the R language (R
Development Core Team, 2009), and use the Akaike
information criterion (AIC) and the Bayesian information
criterion (BIC) to compare the models. These are
defined, respectively, by $-2l(\cdot)+2q$ and $-2l(\cdot)+q\log(n)$,
where $l(\cdot)$ is the log-likelihood evaluated at the MLE
vector of the respective distribution, q is the number of
parameters estimated and n is the sample size. The
best distribution corresponds to the lowest $-l(\cdot)$, AIC and
BIC values.
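As a small worked example of the two criteria, the sketch below computes AIC $= -2l(\cdot)+2q$ and BIC $= -2l(\cdot)+q\log(n)$ from a minimized negative log-likelihood. The function names are illustrative; the inputs in the usage lines are the values reported in Table 1 for the ECEG fit to D1 ($-l(\cdot) = 31.129$, $q = 3$, $n = 20$).

```python
import math


def aic(neg_loglik, q):
    """Akaike information criterion from the minimized -l(.) and q parameters."""
    return 2.0 * neg_loglik + 2.0 * q


def bic(neg_loglik, q, n):
    """Bayesian information criterion, using the natural logarithm of n."""
    return 2.0 * neg_loglik + q * math.log(n)


if __name__ == "__main__":
    # ECEG fit to D1 as reported in Table 1.
    print(round(aic(31.129, 3), 3))
    print(round(bic(31.129, 3, 20), 3))
```

With these inputs the functions reproduce the AIC of 68.258 and the BIC of 71.245 listed in Table 1, which confirms that the natural logarithm is the one used in the BIC.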

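The information matrix is given by:

$$I(\hat\alpha,\hat\lambda,\hat\theta) = -\left.\begin{pmatrix}
\dfrac{\partial^{2}\ell}{\partial\alpha^{2}} & \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\lambda} & \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\theta}\\[2mm]
\dfrac{\partial^{2}\ell}{\partial\lambda\,\partial\alpha} & \dfrac{\partial^{2}\ell}{\partial\lambda^{2}} & \dfrac{\partial^{2}\ell}{\partial\lambda\,\partial\theta}\\[2mm]
\dfrac{\partial^{2}\ell}{\partial\theta\,\partial\alpha} & \dfrac{\partial^{2}\ell}{\partial\theta\,\partial\lambda} & \dfrac{\partial^{2}\ell}{\partial\theta^{2}}
\end{pmatrix}\right|_{(\alpha,\lambda,\theta)=(\hat\alpha,\hat\lambda,\hat\theta)},$$

where $\ell = \ell(\alpha,\lambda,\theta)$. The elements of the information matrix can
only be obtained numerically.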
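One common way to obtain such second derivatives numerically is a central finite-difference Hessian of the negative log-likelihood, evaluated at the MLE. The sketch below shows this generic technique; it is not taken from the paper, the function name and the step sizes are assumptions, and in practice the steps should be scaled to the magnitude of each parameter.

```python
def numerical_hessian(func, x, steps):
    """Central finite-difference Hessian of func at the point x.

    func takes a list of parameter values and returns a float;
    steps[i] is the finite-difference step used for coordinate i.
    """
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                z = list(x)
                z[i] += di * steps[i]
                z[j] += dj * steps[j]
                return func(z)
            # For i == j this reduces to the usual second difference with step 2*h.
            hess[i][j] = ((shifted(1, 1) - shifted(1, -1)
                           - shifted(-1, 1) + shifted(-1, -1))
                          / (4.0 * steps[i] * steps[j]))
    return hess
```

Applied to a negative log-likelihood at its minimizer, the returned matrix is an approximation of the observed information; its inverse is then the usual estimate of the asymptotic covariance matrix of the MLEs.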

Data Analysis 

In this section our methodology is illustrated on three
datasets, two artificial and one real, the latter related to mammary
tumors in female rats.

First, a simulation study was performed to examine the
behavior of different distributions fitted to





the survival times generated from the ECEG distribution.
We generate two samples from the ECEG distribution by
using the inverse transformation of the cumulative
distribution function (10), Data 1 (D1) and Data 2 (D2),
with the same sample size and censoring percentage,
n=20 and p=0.1, respectively. For D1 we
consider $(\alpha,\lambda,\theta)=(4.2323, 0.07015, 0.50)$, with increasing
hazard (the TTT plot is shown in Figure 2, left), and
for D2 the parameter vector is $(\alpha,\lambda,\theta)=(0.30, 0.0005,
0.975)$, with decreasing hazard (Figure 2, right).

We fit the ECEG distribution to the datasets and
compare its fit with those of its particular cases: the
Complementary Exponential Geometric distribution,
with pdf given by $f(x)=\lambda\theta e^{-\lambda x}(e^{-\lambda x}(1-\theta)+\theta)^{-2}$, where $\lambda>0$
is a scale parameter and $0<\theta<1$ is a shape parameter,
and the Exponential (E) distribution, with pdf given by
$f(x)=(1/\theta)e^{-x/\theta}$, where $\theta>0$ is the scale (mean) parameter.

The distributions are compared considering the
$-l(\cdot) = -\log L(\hat\alpha, \hat\lambda, \hat\theta)$ values, the Akaike Information
Criterion (AIC) and the Bayesian Information
Criterion (BIC), defined respectively by $-2l(\cdot)+2q$ and
by $-2l(\cdot)+q\log(n)$, where $(\hat\alpha,\hat\lambda,\hat\theta)$ is the MLE vector, q
is the number of parameters estimated and n is the
sample size. The best distribution corresponds to the
lowest $-l(\cdot)$, AIC and BIC values.

The MLEs of the parameters $\alpha$, $\lambda$ and $\theta$ of the ECEG
distribution are given, respectively, by 4.025, 0.870 and
0.003 for D1, and 0.205, 0.001 and 0.103 for D2.

Table 1 shows the values of the statistics $-l(\cdot)$, AIC, BIC
and the Kolmogorov-Smirnov (K-S) statistics with
their p-values, which provide evidence in favor of the
ECEG distribution.



Figure 3 shows the survival curves of the fitted
models, corroborating the values in Table 1.





Figure 2 - TTT plot from artificial datasets D1 (left) and D2 (right)

TABLE 1 VALUES OF THE INFORMATION CRITERIA FOR THE
SIMULATED SAMPLES: DATA 1 (D1) AND DATA 2 (D2)

           D1                             D2
           ECEG     CEG      E            ECEG      CEG       E
-l(.)      31.129   33.298   57.934       57.934    140.886   138.867
AIC        68.258   70.597   117.869      257.158   285.772   279.735
BIC        71.245   72.589   118.865      260.146   287.764   280.731
K-S        0.197    0.253    0.423        0.214     0.429     0.410
p-value    0.414    0.151    0.001        0.317     0.001     0.002



Figure 3 - Fitted survival functions of the ECEG, CEG, E distributions superimposed to the
Kaplan-Meier fit from Data 1 (left) and Data 2 (right)

As a real example, consider the dataset from Lawless
(2003), which consists of data from a nine-month study
on the effect of the known carcinogens DES and DMBA in
the induction of mammary tumors in female rats
(Shellabarger, McKnight, Stone and Holtzman, 1980).

Table 2 shows the parameter MLEs for each
of the fitted distributions, the values of
the AIC, BIC and $-l(\cdot)$, and the K-S statistics with
their p-values. The ECEG distribution outperforms its
competitors on all the criteria considered,
corroborating the fact that the ECEG distribution can
be seen as a competitive distribution of practical
significance for the analysis of survival data.

These conclusions are corroborated by the fitted
survival functions of the ECEG, CEG and E distributions
superimposed on the Kaplan-Meier fit. We observe a
clear difference between the fitted curves, which is a
strong motivation for choosing the most suitable
distribution for fitting the data.

TABLE 2 ESTIMATES AND VALUES OF THE INFORMATION
CRITERIA FOR THE REAL DATASET

Model   Parameters (α, λ, θ)     AIC       BIC       -l(.)     K-S stat   p-value
ECEG    21.091; 0.023; 0.971     211.358   214.765   102.679   0.214      0.238
CEG     -; 0.030; 0.011          215.959   218.230   105.979   0.225      0.194
E       -; 175.860; -            236.455   237.591   117.227   0.263      0.082





Figure 4 - Real data: TTT plot (left) and fitted survival functions of the ECEG, CEG, E
distributions superimposed to the Kaplan-Meier fit (right)



Concluding Remarks

In this paper we introduce an extension of the
Complementary Exponential Geometric distribution.
The new distribution is much more flexible than its
predecessor, the CEG distribution, presenting both increasing
and decreasing failure rates. We provide
statistical properties of the ECEG distribution,
including reliability measures, the density, failure rate,
moments, quantiles and order statistics. Estimation via
maximum likelihood is straightforward. Applications of the
ECEG distribution to artificial and real data show
that it can provide a better fit than its particular
cases.

Appendix

Throughout the paper we use the following
equations:

$$\int_0^{1}z^{p-1}(1-z)^{n-1}\left(1+bz^{m}\right)^{l}dz = \Gamma(n)\sum_{k=0}^{\infty}\binom{l}{k}b^{k}\frac{\Gamma(p+km)}{\Gamma(p+n+km)}, \tag{17}$$

where $\Gamma(a,b)$ is the incomplete Gamma function given
by $\Gamma(a,b)=\int_b^{\infty}y^{a-1}e^{-y}\,dy$.



$$(1-x)^{-r} = \sum_{k=0}^{\infty}\frac{(r)_k}{k!}x^{k}, \tag{18}$$

where $(r)_k$ is the Pochhammer symbol, given by $(r)_k = r(r+1)\cdots(r+k-1)$;
the series converges if $|x|<1$, and $(-r)_k = (-1)^{k}(r-k+1)_k$.